Integrated Analysis Identifies DPP7 as a Prognostic Biomarker in Colorectal Cancer

Simple Summary
Colorectal cancer (CRC) is one of the most common cancers; it has a poor prognosis and is prone to recurrence and metastasis. DPP7 is a prolyl peptidase, an enzyme that cleaves proline-containing peptides. By analyzing public colorectal cancer data and surgical specimens from CRC patients, we found that the expression level of DPP7 was significantly higher in CRC samples than in adjacent non-tumor tissues. Moreover, increased expression of DPP7 was correlated with a higher stage of cancer and shorter overall survival, indicating the diagnostic value of DPP7 for CRC. Furthermore, functional annotations indicated that DPP7 is involved in neuroactive ligand–receptor interaction and olfactory transduction signaling. Our data demonstrated that DPP7 is a promising diagnostic and prognostic biomarker as well as a new therapeutic target for CRC.

Abstract
Colorectal cancer has a poor prognosis and is prone to recurrence and metastasis. DPP7, a prolyl peptidase, is reported to regulate lymphocyte quiescence. However, the correlation of DPP7 with prognosis in CRC remains unclear. With publicly available cohorts, the Wilcoxon rank-sum test and logistic regression were employed to analyze the relationship between DPP7 expression and the clinicopathological features of CRC patients. Specific pathways of differentially expressed genes were determined through biofunctional analysis and gene set enrichment analysis (GSEA). qPCR and immunohistochemical staining were used to determine DPP7 expression levels in surgical specimens. The public dataset and the analysis of biospecimens from CRC patients revealed that DPP7 expression was significantly higher in CRC samples than in non-tumor tissues. Moreover, increased DPP7 expression was significantly associated with a higher N stage, lymphatic invasion, and shorter overall survival. Functionally, DPP7 is involved in neuroactive ligand–receptor interaction and olfactory transduction signaling. We identified a series of targeted drugs and small-molecule drugs whose responses correlated with DPP7 expression. To conclude, DPP7 is a valuable diagnostic and prognostic biomarker for CRC and may be considered a new therapeutic target.


Introduction
Colorectal cancer (CRC) remains one of the most common malignancies worldwide, and disease recurrence after surgery and metastasis are the major causes of the poor prognosis of CRC [1]. As highlighted in epidemiological studies, approximately 80% of CRC recurrence occurs within the first 3 years after primary tumor resection [2]. It is also instructive to note that the time interval from initial treatment to tumor recurrence is a vital prognostic factor 1.18.4, https://cran.r-project.org/web/packages/pROC/index.html, accessed on 30 June 2023). Logistic regression was used to assess the relationship between clinicopathological features and DPP7 expression, and univariate and multivariate Cox regression analyses were applied to assess the prognostic role of DPP7 in patients with CRC. Bonferroni correction was used for multiple comparisons. The survival curve was generated using the Kaplan-Meier method with prognostic data obtained from previous research [14]. A nomogram based on the results of the multivariate analysis was constructed with the rms package (Version 6.7-0, https://cran.r-project.org/web/packages/rms/index.html, accessed on 30 June 2023) to individualize the prediction of 1-, 3-, and 5-year survival.
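
As an illustration of the Kaplan-Meier method mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the R workflow used in the study; the function name `kaplan_meier` and the toy follow-up times are hypothetical and serve only to show how the survival probability is updated at each observed death while censored patients simply leave the risk set.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator).

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # multiply by the conditional probability of surviving past t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Hypothetical follow-up times in months (not data from this study).
toy_times = [5, 8, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Comparing two such curves (for example, high versus low DPP7 expression) would additionally require a test such as the log-rank test, which is not shown here.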

Biological Functional Analysis and Gene Set Enrichment Analysis (GSEA)
The gene expression data for the CRC cases were collected from the TCGA COAD and READ databases. Expression profile data (HTSeq-Counts) of the high-DPP7-expression group and the low-DPP7-expression group were analyzed and compared, and differentially expressed genes (DEGs) were further identified using the Wilcoxon rank-sum test. The threshold criteria were set as a |log fold change| > 1.5 and an adjusted p-value < 0.05. Each analysis step was repeated 1000 times for the reliability of the results. For functional or pathway items, adjusted p-values < 0.05 and a false discovery rate (FDR) threshold of 0.25 were used to identify significant differences [15]. The protein-protein interaction (PPI) network of DPP7 co-expressed genes was predicted based on the STRING (http://string-db.org; version 10.0) platform, and the interactions between proteins were also analyzed [16]. The main parameters were set as follows: the minimum required interaction score ("Low confidence (0.150)"), meaning of network edges ("evidence"), and size cutoff ("no more than 20 interactors"). Finally, the available experimentally determined DPP7-binding proteins were obtained, and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were carried out to present the possible biological functions and signaling pathways affected by DPP7 [17]. GSEA [18] was performed using the R package clusterProfiler [19] to demonstrate the significant functional and pathway differences between the DPP7 high-expression and low-expression groups.
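
The DEG thresholds stated above (|log fold change| > 1.5 and adjusted p-value < 0.05) can be expressed as a simple filter. The sketch below assumes that per-gene fold changes and adjusted p-values have already been computed; the function name `select_degs`, the tuple layout, and the gene labels are hypothetical placeholders rather than part of the original analysis.

```python
def select_degs(results, lfc_cutoff=1.5, padj_cutoff=0.05):
    """Split genes into up- and down-regulated DEGs.

    results : iterable of (gene, log_fold_change, adjusted_p) tuples, with the
              fold change taken as high- versus low-DPP7-expression group.
    """
    up, down = [], []
    for gene, lfc, padj in results:
        # a gene must pass both the significance and the effect-size threshold
        if padj < padj_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return {"up": up, "down": down}

# Hypothetical rows (not results from this study).
toy_results = [
    ("GENE_A", 2.3, 0.001),   # passes both thresholds, up-regulated
    ("GENE_B", -1.8, 0.020),  # passes both thresholds, down-regulated
    ("GENE_C", 1.2, 0.001),   # fold change too small
    ("GENE_D", 3.0, 0.200),   # not significant after adjustment
]
degs = select_degs(toy_results)
print(degs)
```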

Correlation between DPP7 and Drug Response
We collected the IC50 values of 261 small molecules in 860 cell lines from Genomics of Drug Sensitivity in Cancer (GDSC, https://www.cancerrxgene.org/) and of 481 small molecules in 1001 cell lines from the Cancer Therapeutics Response Portal (CTRP, https://portals.broadinstitute.org/ctrp.v2.1/?page=#ctd2BodyHome), as well as the corresponding mRNA gene expression data, accessed on 30 June 2023. The mRNA expression data and drug sensitivity data were merged. To determine the strength of the association between gene expression and drug sensitivity, we used Pearson correlation analysis, adjusted for FDR.
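
To make the correlation step concrete, the following sketch computes a Pearson correlation coefficient and applies a Benjamini-Hochberg adjustment to a list of p-values. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here; the function names and all numbers are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical expression and IC50 values across four cell lines.
toy_expression = [1.0, 2.0, 3.0, 4.0]
toy_ic50 = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(toy_expression, toy_ic50)

# Hypothetical raw p-values for four drug-gene pairs.
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(r, toy_fdr)
```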

RNA Extraction and Quantitative Real-Time RT-PCR
Thirty specimens of colorectal tumors and thirty corresponding normal tissues taken 10 cm from the tumors were obtained from Sir Run Run Shaw Hospital. This study was approved by the Ethics Committee of Sir Run Run Shaw Hospital (approval number: 20230629-023). Total RNA was extracted from the tissues with RNA Isolater Total RNA Extraction Reagent combined with chloroform, isopropanol, and 75% ethanol. cDNA synthesis was performed with the HiScript II 1st Strand cDNA Synthesis Kit (Vazyme, R212-01). A CFX96 Touch Real-Time PCR Detection System (BIO-RAD) with MagicSYBR Mixture (Cwbio, CW3008M) was used to perform the qPCR analysis. The expression level of DPP7 was normalized to the GAPDH expression level, and the 2^(−ΔCt) method was used to express the DPP7 mRNA expression level in the colorectal tumor and adjacent non-tumor tissues. The primers were as follows: for GAPDH, forward 5′-GGAGTCAACGGATTTGGTCGT-3′, reverse 5′-TCTCGCTCCTGGAAGATGGT-3′; and for DPP7, forward 5′-GAAGCGTTCCGACAGATCAAG-3′, reverse 5′-TCAGGTCCTTCTCGTCTGACA-3′.
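
The normalization described above reduces to a one-line formula: ΔCt is the Ct of the target gene minus the Ct of the reference gene, and relative expression is 2 raised to −ΔCt. A minimal sketch follows; the function name and the Ct values are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^(-dCt) method.

    ct_target    : Ct value of the gene of interest (here, DPP7)
    ct_reference : Ct value of the reference gene (here, GAPDH)
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)

# Hypothetical Ct values for one paired sample.
tumor = relative_expression(ct_target=24.0, ct_reference=20.0)   # dCt = 4
normal = relative_expression(ct_target=26.0, ct_reference=20.0)  # dCt = 6
print(tumor, normal, tumor / normal)
```

A lower ΔCt means that fewer cycles were needed to detect the target relative to the reference, so the resulting relative expression value is higher.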

Tissue Microarray and IHC
A tissue microarray of colorectal cancer (HColA180Su16) was purchased from Outdo Biotech (Shanghai, China). DPP7 antibody (NBP1-84986, Novusbio, Centennial, CO, USA) was used as the primary antibody. This study was approved by the Ethics Committee of Outdo Biotech (approval number: SHYJS-CP-1707004). From the tissue microarray of colorectal cancer, we obtained 99 colorectal tumor tissues to detect the expression level of DPP7, and the correlation between DPP7 expression and the clinicopathological features of CRC patients was analyzed. In total, 62 of the colorectal tumor tissues had the corresponding normal tissues (taken 10 cm from the tumor) and were used to compare the DPP7 expression level through a paired t-test. Immunohistochemical staining was scored according to the published methods [20]. Briefly, the staining scores are based on multiplying the staining intensity (no staining, 0; weak staining, 1; moderate staining, 2; strong staining, 3) and the percentage of positive cells (1-25%, 0; 26-50%, 1; 51-75%, 3; >75%, 4). The staining scores ranged from 0 to 12.
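
The scoring rule described above (intensity multiplied by a percentage category) can be sketched as follows. The percentage cut-offs and category values are transcribed exactly as printed in the preceding sentence, and the cited method [20] remains the definitive source for them; mapping 0% positive cells to category 0 is an assumption, since that case is not listed. The function names are hypothetical.

```python
# Upper bound of each percentage bin and its category value, as printed in the text.
# Anything above the last bound (>75%) falls into category 4.
PERCENT_BINS = [(25, 0), (50, 1), (75, 3)]
TOP_CATEGORY = 4

def percentage_category(percent_positive):
    """Map the percentage of positive cells to its category value."""
    for upper_bound, category in PERCENT_BINS:
        if percent_positive <= upper_bound:
            return category  # 0% is assumed to share the lowest category
    return TOP_CATEGORY

def staining_score(intensity, percent_positive):
    """Staining score = intensity (0 to 3) x percentage category."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * percentage_category(percent_positive)

# Hypothetical cores: strong staining in 80% of cells, moderate staining in 60%.
score_strong = staining_score(intensity=3, percent_positive=80)
score_moderate = staining_score(intensity=2, percent_positive=60)
print(score_strong, score_moderate)
```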
To confirm the conclusions from the TCGA dataset, we compared the expression of DPP7 in normal and CRC primary tissues based on the GSE44076 dataset as external validation (Figure S1). As summarized in Table 1, high DPP7 mRNA expression was significantly associated with an advanced N stage (p = 0.004) and lymphatic invasion (p < 0.001). Further analysis revealed no statistically significant difference in DPP7 expression levels between the N0 and N1 groups (p = 0.49) or the N1 and N2 groups (p = 0.094). However, a significant difference in DPP7 expression levels was observed between the N0 and N2 groups (p = 0.010). Moreover, no significant associations with the other clinicopathologic features were revealed. These data suggest that DPP7 is overexpressed in CRC tissues, and higher DPP7 expression is closely related to CRC progression and metastasis.

Diagnostic and Prognostic Value of DPP7 Expression in CRC
We first analyzed the diagnostic value of DPP7 in colorectal cancer patients. The AUC values of DPP7 in the COADREAD, COAD, and READ databases were 0.914 (95% CI: 0.888-0.940, Figure 3A), 0.919 (95% CI: 0.890-0.949, Figure 3B), and 0.901 (95% CI: 0.848-0.955, Figure 3C), respectively. This demonstrated that DPP7 is a potential diagnostic biomarker for CRC. We further investigated the association between DPP7 expression levels and survival prognosis through Kaplan-Meier analyses. As shown in Figure 4, patients from the TCGA COADREAD dataset with high DPP7 expression levels had a shorter median OS ( Figure

Construction and Evaluation of a Predictive Nomogram with DPP7
To provide a method for quantitatively predicting the prognosis of CRC, we constructed a novel nomogram, integrating the aforementioned clinical characteristics independently associated with survival. A point scale was used to assign these variables to the nomogram based on multivariate Cox analysis (age, lymphatic invasion, residual tumor, and DPP7 expression; Figure 5B). The 1-year, 3-year, and 5-year survival probabilities for CRC patients were determined based on vertical lines from the total point axis down to the outcome axis. The C-index of the nomogram was 0.707 (95% confidence interval: 0.673-0.742) after 1000 bootstrap replicates. The bias-corrected line in the calibration plot was close to the ideal curve (i.e., the 45-degree line), indicating good agreement between the predicted and observed values (Figure 5C). Our results suggest that this nomogram is superior to a single prognostic factor in predicting long-term survival in CRC patients.
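
For readers unfamiliar with the C-index reported above, the sketch below computes Harrell's concordance index for right-censored data in plain Python. It does not reproduce the rms-based calculation or the 1000 bootstrap replicates used in the study; the function name and the toy values are hypothetical. A pair of patients is comparable when the one with the shorter follow-up had an observed event, and it is concordant when that patient also has the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times       : follow-up times
    events      : 1 if the event was observed, 0 if censored
    risk_scores : model output where a higher value means a worse prognosis
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # comparable: patient i had an observed event strictly before time j
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical patients (not data from this study).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported value of 0.707.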

The PPI Network and DPP7-Related Pathways and Biological Functions
Based on the STRING (https://string-db.org/, accessed on 30 June 2023) platform, a total of 20 DPP7-interacting proteins are presented. As shown in Figure 6A, the top five nodes with the highest degree centrality were sulfatase-modifying factor 1 (SUMF1), biotinidase (BTD), alpha glucosidase (GAA), sialic acid acetylesterase (SIAE), and fucosyltransferase 11 (FUT11). To understand the DPP7-related pathways and biological functions in CRC, we further analyzed the GO and KEGG pathways using the data obtained from the TCGA database. The cellular component, molecular function, and biological process categories were included in the GO enrichment analysis. Finally, we identified the top 12 enriched GO terms, as shown in Figure 6B. The top four enriched biological process terms were neutrophil degranulation, neutrophil activation, neutrophil activation involved in immune response, and neutrophil mediated immunity. The cellular component terms significantly correlated with DPP7 were vacuolar lumen, lysosomal lumen, primary lysosome, and azurophil granule. Moreover, the molecular function enrichment analysis showed that DPP7 was significantly correlated with the following terms: hydrolase activity, acting on glycosyl bonds; hydrolase activity, hydrolyzing O-glycosyl compounds; fucosyltransferase activity; and transferase activity, transferring glycosyl groups. Interestingly, the KEGG pathway analysis demonstrated that DPP7 was involved in pathways related to the lysosome, glycosaminoglycan degradation, other glycan degradation, and various types of N-glycan biosynthesis (Figure 6C). These findings indicated that DPP7 may play a role in the regulation of neutrophil activation, glycan metabolism, and the function of lysosomes.
The GSEA results based on the differentially expressed genes in the high-DPP7-expression and low-DPP7-expression samples identified significantly enriched pathways involving neuroactive ligand-receptor interaction (NES = 1.430; p = 0.048; FDR = 0.038, Figure 6D) and olfactory transduction signal (NES = 1.719; p = 0.048; FDR = 0.038, Figure 6E).

Correlation between DPP7 and Drug Response
To identify the correlation between DPP7 expression and predicted drug efficacy, 261 small molecules in 860 cell lines from GDSC and 481 small molecules in 1001 cell lines from CTRP were collected and analyzed with respect to the DPP7 mRNA expression level (Table S2). The top 30 drugs were ranked using the integrated level of the correlation coefficient and FDR of the searched genes; the correlation coefficient had to be >0.1 with an FDR < 0.05. We then multiplied the −log10(FDR) by the absolute value of the correlation coefficient to obtain a score for each drug-gene pair. The total score for each drug was obtained by adding up the scores for all the pairs associated with that drug. The resulting data are visualized in Figure 7, where the genes and drugs are ranked according to their scores. Positive correlations between DPP7 expression and drug sensitivity are represented by red bubbles, which means that the higher the DPP7 expression is, the more resistant the cells are to the indicated drug. Negative correlations between DPP7 expression and drug sensitivity are denoted by blue bubbles, which means that the higher the DPP7 level is, the less resistant the cells are to the indicated drug. The level of correlation is indicated by the color intensity, with deeper colors signifying higher correlations. The size of the bubble is positively correlated with the FDR significance of the association. The black outline border in the plot indicates an FDR of ≤0.05. As Figure 7 shows, the expression of DPP7 exhibited a negative correlation with the efficacy of several drugs, such as fluorouracil, dabrafenib, and selumetinib, frequently prescribed for colon cancer, providing a potential link between DPP7 expression and the sensitivity of cancer cells to various drugs.
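
The ranking score described in this section, the −log10 of the FDR multiplied by the absolute correlation coefficient and summed over all pairs for a drug, can be sketched as follows. The function name, the tuple layout, and the drug and gene labels are hypothetical, and the numbers are not results from Table S2.

```python
import math
from collections import defaultdict

def drug_scores(pairs):
    """Rank drugs by summed pair scores.

    pairs : iterable of (drug, gene, correlation, fdr) tuples
    Each pair contributes -log10(fdr) * |correlation| to its drug's total.
    """
    totals = defaultdict(float)
    for drug, _gene, correlation, fdr in pairs:
        totals[drug] += -math.log10(fdr) * abs(correlation)
    # highest total score first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical drug-gene pairs.
toy_pairs = [
    ("drug_A", "DPP7", -0.5, 0.001),   # 3 * 0.5 = 1.5
    ("drug_B", "DPP7", 0.2, 0.01),     # 2 * 0.2 = 0.4
    ("drug_A", "GENE_X", 0.3, 0.01),   # 2 * 0.3 = 0.6
]
ranking = drug_scores(toy_pairs)
print(ranking)
```

Taking the absolute value of the correlation means that strongly positive and strongly negative associations both raise a drug's rank; the direction is conveyed separately by the bubble color.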

DPP7 Is Highly Expressed in Patients with Colorectal Cancer and Associated with a Poor Prognosis
To further investigate the expression level of DPP7 in human colorectal cancer, we used qPCR to detect DPP7 mRNA in colorectal tumor tissues and their paired adjacent normal tissues. As Figure 8A shows, DPP7 expression was significantly increased in the tumor tissues compared with the adjacent non-tumor tissues. Next, we used a tissue microarray to determine DPP7 expression at the protein level and evaluate the clinical relevance of DPP7. We found that DPP7 was more highly expressed in the tumor tissues than in the paired adjacent non-tumor tissues (Figure 8B,C), and the increased expression of DPP7 predicted shorter overall survival (Figure 8D). Therefore, DPP7 is aberrantly overexpressed in colorectal tumors and is closely associated with the poor prognosis of patients with colorectal cancer.

Discussion
DPP7 is known to be correlated with disease and cancer, but the prognostic value of DPP7 in CRC remains unclear. This study found that DPP7 expression in CRC samples was significantly higher than that in normal tissues by analyzing public CRC data and clinical bio-specimens. Furthermore, the enhanced expression level of DPP7 is correlated with high-grade tumors and low overall survival, indicating the diagnostic value of DPP7 for CRC.
In the present study, we first explored the prognostic value of DPP7 in the TCGA and validated our findings based on another 98 CRC specimens and 148 normal tissues from the GSE44076 database. Both the mRNA and protein levels of DPP7 were significantly higher in the colon cancer samples than in the normal tissues. After verifying the differential expression of DPP7 in the colon cancer and adjacent tissues, we further analyzed the prognostic and diagnostic value of DPP7. Several studies have explored the prognostic value of DPP7 in cancers. Our results demonstrated that a high expression of DPP7 was associated with a poor prognosis, including a shorter OS, PFI, and DSS for patients with CRC. Our results are consistent with a previous study of CRC. Ahluwalia P et al. [21] established a high prognostic score, composed of four gene signatures including DPP7, which can predict a poor prognosis in CRC patients. However, a recent study found that a high expression level of DPP7 was associated with a good prognosis in breast cancer [10]. DPP7 may therefore play different prognostic roles in different cancer types, and our study is the first to clarify the prognostic value of DPP7 as a single gene in CRC. We further constructed a nomogram incorporating the expression level of DPP7 and another three clinicopathologic variables, namely age, lymphatic invasion, and residual tumor. The points assigned to these variables were summed and recorded as the total points. According to the nomogram, DPP7 expression spanned the widest range of points (from 0 to 100) compared with the other clinical variables within the nomogram, which was consistent with the results of the multivariate Cox regression.
To identify the potential relationships between DPP7 and other proteins in CRC, we conducted a PPI network analysis and identified the top five nodes with the highest degree centrality: sulfatase-modifying factor 1 (SUMF1), biotinidase (BTD), alpha glucosidase (GAA), sialic acid acetylesterase (SIAE), and fucosyltransferase 11 (FUT11). Recent studies have shown that most of the abovementioned genes are associated with the pathogenesis, progression, and prognosis of various cancers [22][23][24][25][26][27]. We further conducted enrichment analyses, and the results demonstrated that DPP7 was significantly correlated with fucosyltransferase activity. The previous work of Deschuyter M et al. showed that the overexpression of O-fucosyltransferase 1 in many cancers, including CRC, leads to NOTCH signaling dysregulation associated with carcinogenesis [28]. These results suggest that DPP7 may affect the expression of O-fucosyltransferase 1 in CRC by regulating the activity of fucosyltransferase and thereby alter the biological features of tumors. Furthermore, we integrated the potential biological functions of DPP7-expression-related genes with a series of enrichment analyses and identified the potential impacts of "neutrophil activation" and "neutrophil mediated immunity" on the etiology or pathogenesis of CRC. The role of DPP7 in neutrophil-associated immune responses has not been reported. However, DPP1, another member of the dipeptidyl peptidase family, activates neutrophil serine proteases to promote neutrophil maturation [29]. Inhibitors of DPP1, such as brensocatib, have been used to treat bronchiectasis in several clinical trials [30][31][32]. There is increasing evidence that tumor-associated neutrophils (TANs) play an important role in cancer progression. Previous studies have extensively described TANs as key drivers of cancer progression, including cancer proliferation, aggressiveness, and dissemination, as well as immune suppression [33,34]. However, such studies only focused on the late stages of tumorigenesis, in which chronic inflammation is most prominent. The role of TANs in early tumor stages remains poorly understood. The question of whether DPP7 regulates TAN activation and immune regulation in the early tumor stages of CRC needs further study and verification.
The expression of DPP7 exhibited a negative correlation with the efficacy of several drugs, such as fluorouracil, dabrafenib, and selumetinib, frequently prescribed for colon cancer. In summary, our study generated valuable insights into the link between the expression of specific genes and the sensitivity of cancer cells to various drugs. The robust analytical approach used in this study has the potential to inform future drug development efforts by identifying novel drug targets and biomarkers that could enhance patient outcomes.
To the best of our knowledge, the present study was the first to identify DPP7 as a promising diagnostic biomarker and an independent prognostic factor for CRC. However, some limitations need to be considered. First, future studies need to validate the nomogram we proposed. Second, we need to carry out further studies on cell lines and animal models to elucidate the biological function and mechanism of DPP7 in colorectal cancer cells.

Conclusions
DPP7 may serve as a novel diagnostic and prognostic biomarker for CRC, probably via its potential role in fucosyltransferase activity. It is plausible that DPP7 may serve as a promising therapeutic target for CRC. The regulative mechanism of DPP7 in colorectal cancer cells deserves further investigation.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers15153954/s1, Figure S1. The expression of DPP7 in normal and CRC primary tissues based on the GSE44076 dataset. Table S1. The clinicopathological features in CRC patients based on the TCGA COAD and READ databases. Table S2. Correlation between mRNA expression of DPP7 and cancer drug sensitivity of GDSC and CTRP.